On the Number of Partial Least Squares Components in Dimension Reduction for Tumor Classification

Authors

  • Xue-Qiang Zeng
  • Guo-Zheng Li
  • Gengfeng Wu
  • Hua-Xing Zou
Abstract

Dimension reduction is important in the analysis of gene expression microarray data, because the high dimensionality of the data sets hurts the generalization performance of classifiers. Partial Least Squares (PLS) based dimension reduction is a frequently used method, since it is specialized for handling high-dimensional data sets and leads to satisfactory classification performance. This paper investigates how varying the number of PLS components influences generalization performance, and how classification performance relates to the regression quality of PLS on the training set. Experimental results show that the number of PLS components for classifiers can be determined automatically from the regression quality of the PLS latent variables.
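The idea in the abstract, choosing the number of components from how well the PLS latent variables regress the class labels on the training set, can be sketched as follows. This is a minimal illustration and not the authors' procedure: the abstract does not state the selection criterion, so the rule used here (keep the smallest number of components whose cumulative training R^2 reaches a `target` value) and the names `pls1_scores`, `choose_n_components`, and `target` are assumptions, and the data are synthetic.

```python
import numpy as np

def pls1_scores(X, y, max_components):
    """NIPALS-style PLS1 on centered data.

    Returns the score matrix T (one column per component) and the
    cumulative R^2 of y on the training set after each component.
    """
    Xk = X - X.mean(axis=0)          # centered predictors, deflated in the loop
    yc = y - y.mean()                # centered response (class labels as 0/1)
    yk = yc.copy()                   # response residual, deflated in the loop
    total_ss = float(yc @ yc)
    scores, r2 = [], []
    for _ in range(max_components):
        w = Xk.T @ yk                # weight vector: covariance direction
        norm = np.linalg.norm(w)
        if norm < 1e-12:             # nothing left to explain
            break
        w /= norm
        t = Xk @ w                   # latent variable (score vector)
        tt = float(t @ t)
        p = Xk.T @ t / tt            # X loading
        q = float(yk @ t) / tt       # regression coefficient of y on t
        Xk = Xk - np.outer(t, p)     # deflate X
        yk = yk - q * t              # deflate y
        scores.append(t)
        r2.append(1.0 - float(yk @ yk) / total_ss)
    return np.column_stack(scores), np.array(r2)

def choose_n_components(r2, target=0.95):
    """Assumed rule: smallest k whose cumulative training R^2 reaches target."""
    hits = np.nonzero(r2 >= target)[0]
    return int(hits[0]) + 1 if hits.size else len(r2)

# Synthetic stand-in for microarray data: 40 samples, 500 "genes", 2 classes.
rng = np.random.default_rng(0)
y = np.repeat([0.0, 1.0], 20)
X = rng.normal(size=(40, 500))
X[:, :10] += 1.5 * y[:, None]        # make the first 10 features informative

T, r2 = pls1_scores(X, y, max_components=10)
k = choose_n_components(r2)
print("cumulative training R^2:", np.round(r2, 3))
print("components kept:", k)
```

The first `k` columns of `T` would then serve as the reduced features for a downstream classifier. Because the R^2 is computed on the training set, it rises quickly when the number of features far exceeds the number of samples, so any threshold of this kind should be checked against held-out performance.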


Similar resources

On partial least squares dimension reduction for microarray-based classification: a simulation study

In microarray tumor tissue classification studies, the expressions of thousands of genes (variables) are simultaneously measured across a few tissue samples. Standard statistical methodologies in classification do not work well when the dimension, p, is greater than the sample size, N. One approach to classification problems, when p > N, is to first apply a dimension reduction method and then perfo...


Tumor classification by partial least squares using microarray gene expression data

MOTIVATION: One important application of gene expression microarray data is the classification of samples into categories, such as the type of tumor. The use of microarrays allows simultaneous monitoring of thousands of gene expressions per sample. This ability to measure gene expression en masse has resulted in data with the number of variables p (genes) far exceeding the number of samples N. Stand...


Orthogonal Projection Weights in Dimension Reduction based on Partial Least Squares

Dimension reduction is important in the analysis of gene expression microarray data, because the high dimensionality of the data set hurts the generalization performance of classifiers. Partial least squares based dimension reduction (PLSDR) is a frequently used method, since it is specialized for handling high-dimensional data sets and leads to satisfactory classification performance. However,...


PLS dimension reduction for classification with microarray data.

Partial Least Squares (PLS) dimension reduction is known to give good prediction accuracy in the context of classification with high-dimensional microarray data. In this paper, the classification procedure consisting of PLS dimension reduction and linear discriminant analysis on the new components is compared with some of the best state-of-the-art classification methods. Moreover, a boosting al...


Sparse partial least squares classification for high dimensional data.

Partial least squares (PLS) is a well-known dimension reduction method which has recently been adapted for high-dimensional classification problems in genome biology. We develop sparse versions of two recently proposed PLS-based classification methods using sparse partial least squares (SPLS). These sparse versions aim to achieve variable selection and dimension reduction simultaneously. We...




Publication date: 2007